Timo Gerkmann

Department of Informatics, University of Hamburg, Hamburg, Germany

Adaptive Rotary Steering with Joint Autoregression for Robust Extraction of Closely Moving Speakers in Dynamic Scenarios

Jan 21, 2026
Via arXiv

Bone-conduction Guided Multimodal Speech Enhancement with Conditional Diffusion Models

Jan 18, 2026
Via arXiv

Real-Time Streamable Generative Speech Restoration with Flow Matching

Dec 22, 2025
Via arXiv

Real-Time Streaming Mel Vocoding with Generative Flow Matching

Sep 18, 2025
Via arXiv

Self-Steering Deep Non-Linear Spatially Selective Filters for Efficient Extraction of Moving Speakers under Weak Guidance

Jul 03, 2025
Via arXiv

ReverbFX: A Dataset of Room Impulse Responses Derived from Reverb Effect Plugins for Singing Voice Dereverberation

May 26, 2025
Via arXiv

Steering Deep Non-Linear Spatially Selective Filters for Weakly Guided Extraction of Moving Speakers in Dynamic Scenarios

May 20, 2025
Via arXiv

Normalize Everything: A Preconditioned Magnitude-Preserving Architecture for Diffusion-Based Speech Enhancement

May 08, 2025
Via arXiv

FlowDec: A flow-based full-band general audio codec with high perceptual quality

Mar 03, 2025
Via arXiv

Mask-Weighted Spatial Likelihood Coding for Speaker-Independent Joint Localization and Mask Estimation

Oct 25, 2024
Via arXiv